AUC Maximization in Bayesian Hierarchical Models
Abstract. Area under the curve (AUC) measures, such as the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPR), are known to be more appropriate than the error rate, especially for imbalanced data sets. Several algorithms optimize AUC measures instead of minimizing the error rate. However, this idea has not been fully exploited in Bayesian hierarchical models owing to the difficulty of inference. Here, we formulate a general Bayesian inference framework, called Bayesian AUC Maximization (BAM), that integrates AUC maximization into Bayesian hierarchical models by borrowing the pairwise and listwise ranking ideas from the information retrieval literature. To showcase the BAM framework, we develop two Bayesian linear classifier variants, one for each ranking approach, and derive their variational inference procedures. We perform validation experiments on four biomedical data sets to demonstrate that our framework achieves better predictive performance than its error-minimizing counterpart in terms of average AUROC and AUPR values.
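The pairwise ranking idea mentioned in the abstract rests on a standard fact: AUROC equals the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The sketch below illustrates only that identity, together with a pairwise hinge surrogate of the kind commonly used in ranking. It is not the BAM model or its variational inference procedure; the function names `pairwise_auroc` and `pairwise_hinge_loss` and the `margin` parameter are hypothetical and chosen for illustration.

```python
def _split_by_label(scores, labels):
    """Separate scores into positives (label 1) and negatives (label 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def pairwise_auroc(scores, labels):
    """AUROC as the fraction of correctly ordered positive-negative pairs."""
    pos, neg = _split_by_label(scores, labels)
    total = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                total += 1.0   # positive ranked above negative
            elif p == n:
                total += 0.5   # ties count as half a correct pair
    return total / (len(pos) * len(neg))


def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Illustrative surrogate (an assumption, not taken from the paper):
    penalize each pair whose score difference falls short of `margin`."""
    pos, neg = _split_by_label(scores, labels)
    total = sum(max(0.0, margin - (p - n)) for p in pos for n in neg)
    return total / (len(pos) * len(neg))


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.3, 0.1]
    labels = [1, 0, 1, 0]
    print(pairwise_auroc(scores, labels))
    print(pairwise_hinge_loss(scores, labels))
```

Because the pair count grows as the product of the class sizes, this quadratic loop is practical only for small data sets; it is meant to make the ranking view of AUROC concrete rather than to be efficient.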
Similar resources
Analysis of Hierarchical Bayesian Models for Large Space Time Data of the Housing Prices in Tehran
Housing prices are correlated with their location in different neighborhoods, so the data exhibit spatial correlation. Housing prices also vary from month to month, so they exhibit temporal correlation as well. Spatio-temporal models are used to analyze this type of data. An important purpose of reviewing this type of data is to fit a suitable model for the spatial-temporal an...
Parametric Discrete Choice Models Based on the Scale Mixtures of Multivariate Normal Distributions (running Title: Parametric Discrete Choice Models)
SUMMARY. A rich class of parametric models is proposed for discrete choice data based on the scale mixtures of multivariate normal distributions. With special connections to multinomial probit, the new models can be implemented in a Bayesian framework without much difficulty. The proposed class of models can be extended to panel data where accounting for heterogeneities is needed. This is done by...
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting non-linear regression models with a symmetric structure, normality is usually assumed in the analysis of the data. This choice may be inappropriate when the distribution of the residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions has become a main concern of many researchers. This family includes several skewed and heavy-tailed d...
Comparison of the efficiency of data mining methods in predicting type 2 diabetes
Background: Diabetes mellitus, a chronic disease, is the most common disease caused by metabolic disorders and is one of the most important health issues worldwide. Nowadays, data mining methods are applied in different fields of science owing to their capability. Therefore, in this study, we compared the efficiency of data mining methods in predicting type 2 diabete...
Presentation of new ensemble method of Bayesian and logistic regression models in landslide susceptibility assessment in the Khalkhal Township
The aim of the current research is to assess landslide susceptibility in the Khalkhal Township, southern Ardabil, using a new ensemble method, namely the Bayesian and logistic regression (BT-LR) model. First, a landslide inventory map was prepared, and then the factors affecting landslide occurrence were identified. These factors are slope degree, plan curvature, slope aspect, elevation, landuse,...